[Metal] Make fast math mode optional #3205

ivarflakstad · 2025-11-20T19:57:37Z

Set math mode via old api if macos version < 15 or ios < 18.

…ed on os version.

jacobgorm · 2025-11-21T13:25:36Z

My build now completes on older MacOS, which is awesome :)

However, It seems to me that the fast-math flag check should be moved to build-time, as having to set an envvar in production is cumbersome and error-prone. I've asked Codex and it it suggests these two approaches, first is IMO cleanest via a feature flag:

diff --git a/candle-metal-kernels/Cargo.toml b/candle-metal-kernels/Cargo.toml
index 8ce09a32..c616d87d 100644
--- a/candle-metal-kernels/Cargo.toml
+++ b/candle-metal-kernels/Cargo.toml
@@ -9,6 +9,9 @@ keywords = ["blas", "tensor", "machine-learning"]
 categories = ["science"]
 license = "MIT OR Apache-2.0"
 
+[features]
+default = ["fast-math"]
+fast-math = []
 
 [dependencies]
 half = { version = "2.5.0", features = [
diff --git a/candle-metal-kernels/src/kernel.rs b/candle-metal-kernels/src/kernel.rs
index f941e302..bebdd187 100644
--- a/candle-metal-kernels/src/kernel.rs
+++ b/candle-metal-kernels/src/kernel.rs
@@ -2,7 +2,6 @@ use crate::source::{
     AFFINE, BINARY, CAST, CONV, FILL, INDEXING, MLX_GEMM, MLX_SORT, QUANTIZED, RANDOM, REDUCE,
     SDPA, SORT, TERNARY, UNARY,
 };
-use crate::utils::get_env_bool;
 use crate::{
     ComputePipeline, ConstantValues, Device, Function, Library, MTLCompileOptions,
     MTLMathFloatingPointFunctions, MTLMathMode, MetalKernelError, Source,
@@ -182,7 +181,7 @@ fn get_compile_options() -> Retained<MTLCompileOptions> {
     let compile_options = MTLCompileOptions::new();
     //unsafe { compile_options.setEnableLogging(true) };
 
-    let fast_math_enabled = get_env_bool("CANDLE_METAL_ENABLE_FAST_MATH", true);
+    let fast_math_enabled = cfg!(feature = "fast-math");
     // Ref availability:
     // https://developer.apple.com/documentation/metal/mtlcompileoptions/mathmode
     if available!(macos = 15, ios = 18) {
diff --git a/candle-metal-kernels/src/utils.rs b/candle-metal-kernels/src/utils.rs
index 1ad647d7..8cf4006a 100644
--- a/candle-metal-kernels/src/utils.rs
+++ b/candle-metal-kernels/src/utils.rs
@@ -1,6 +1,5 @@
 use crate::metal::{Buffer, CommandBuffer, ComputeCommandEncoder, ComputePipeline};
 use crate::MTLSize;
-use std::ffi::OsStr;
 use std::ops::Deref;
 use std::sync::{RwLockReadGuard, RwLockWriteGuard};
 
@@ -236,15 +235,3 @@ impl<'a, T> From<RwLockWriteGuard<'a, T>> for RwLockGuard<'a, T> {
     fn from(g: RwLockWriteGuard<'a, T>) -> Self {
         RwLockGuard::Write(g)
     }
-}
-
-fn is_truthy(s: String) -> bool {
-    match s.as_str() {
-        "true" | "t" | "yes" | "y" | "1" => true,
-        _ => false,
-    }
-}
-
-pub(crate) fn get_env_bool<K: AsRef<OsStr>>(key: K, default: bool) -> bool {
-    std::env::var(key).map(is_truthy).unwrap_or(default)
-}

and second keeps the envvar, but resolves it during build:

diff --git a/candle-metal-kernels/src/kernel.rs b/candle-metal-kernels/src/kernel.rs
index f941e302..6c07cf0c 100644
--- a/candle-metal-kernels/src/kernel.rs
+++ b/candle-metal-kernels/src/kernel.rs
@@ -2,7 +2,6 @@ use crate::source::{
     AFFINE, BINARY, CAST, CONV, FILL, INDEXING, MLX_GEMM, MLX_SORT, QUANTIZED, RANDOM, REDUCE,
     SDPA, SORT, TERNARY, UNARY,
 };
-use crate::utils::get_env_bool;
 use crate::{
     ComputePipeline, ConstantValues, Device, Function, Library, MTLCompileOptions,
     MTLMathFloatingPointFunctions, MTLMathMode, MetalKernelError, Source,
@@ -182,7 +181,7 @@ fn get_compile_options() -> Retained<MTLCompileOptions> {
     let compile_options = MTLCompileOptions::new();
     //unsafe { compile_options.setEnableLogging(true) };
 
-    let fast_math_enabled = get_env_bool("CANDLE_METAL_ENABLE_FAST_MATH", true);
+    let fast_math_enabled = FAST_MATH_ENABLED;
     // Ref availability:
     // https://developer.apple.com/documentation/metal/mtlcompileoptions/mathmode
     if available!(macos = 15, ios = 18) {
@@ -200,3 +199,9 @@ fn get_compile_options() -> Retained<MTLCompileOptions> {
     }
     compile_options
 }
+
+const FAST_MATH_ENABLED: bool = match option_env!("CANDLE_METAL_ENABLE_FAST_MATH") {
+    Some("true") | Some("t") | Some("yes") | Some("y") | Some("1") => true,
+    Some(_) => false,
+    None => true,
+};
diff --git a/candle-metal-kernels/src/utils.rs b/candle-metal-kernels/src/utils.rs
index 1ad647d7..8cf4006a 100644
--- a/candle-metal-kernels/src/utils.rs
+++ b/candle-metal-kernels/src/utils.rs
@@ -1,6 +1,5 @@
 use crate::metal::{Buffer, CommandBuffer, ComputeCommandEncoder, ComputePipeline};
 use crate::MTLSize;
-use std::ffi::OsStr;
 use std::ops::Deref;
 use std::sync::{RwLockReadGuard, RwLockWriteGuard};
 
@@ -236,15 +235,3 @@ impl<'a, T> From<RwLockWriteGuard<'a, T>> for RwLockGuard<'a, T> {
     fn from(g: RwLockWriteGuard<'a, T>) -> Self {
         RwLockGuard::Write(g)
     }
-}
-
-fn is_truthy(s: String) -> bool {
-    match s.as_str() {
-        "true" | "t" | "yes" | "y" | "1" => true,
-        _ => false,
-    }
-}
-
-pub(crate) fn get_env_bool<K: AsRef<OsStr>>(key: K, default: bool) -> bool {
-    std::env::var(key).map(is_truthy).unwrap_or(default)
-}

ivarflakstad · 2025-11-21T13:39:13Z

Great!

Wrt having the flag set at buildtime I disagree.

We don't have to supply the env var at all as we have a default.

We want to be able to toggle it easily at runtime so that we can verify the correctness of the model implementations on metal.
We don't want to rebuild everything just for this kind of verification/experimentation.

ivarflakstad · 2025-11-21T13:44:29Z

Additionally, I have a sneaking suspicion that this flag will be removed and that we'll disable fast math in the future. I haven't noticed any meaningful performance improvements from having it enabled vs not.

jacobgorm · 2025-11-21T14:00:35Z

I am happy as long as it unbreaks the build and is on by default :). In my experience, fast-math makes a huge difference on CPU for fp32 models, but not sure how much it matters for fp16 on GPU, and probably not at all for quantized models. That said, I have never seen any adverse effects of having it on.

Wrt this being a build-time setting, I have always preferred to compile my shaders ahead of time, so there is naturally becomes part of the build

Add ability to toggle fast math mode in metal. Chose how to apply bas…

809eb8f

…ed on os version.

ivarflakstad mentioned this pull request Nov 20, 2025

Lost support for MacOS < 15.0 Sequoia #3185

Open

Move available macro and friends to utils

8b670c2

ivarflakstad requested review from ArthurZucker and pcuenca November 20, 2025 22:07

ivarflakstad marked this pull request as ready for review November 20, 2025 22:07

ivarflakstad added 3 commits November 20, 2025 23:10

Isolate #[allow(deprecated)] to the actually deprecated method

b970a40

doc

1fc2392

Use objc2::available macro instead

d8b1780

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Metal] Make fast math mode optional #3205

[Metal] Make fast math mode optional #3205

Uh oh!

ivarflakstad commented Nov 20, 2025 •

edited

Loading

Uh oh!

jacobgorm commented Nov 21, 2025 •

edited

Loading

Uh oh!

ivarflakstad commented Nov 21, 2025

Uh oh!

ivarflakstad commented Nov 21, 2025

Uh oh!

jacobgorm commented Nov 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[Metal] Make fast math mode optional #3205

Are you sure you want to change the base?

[Metal] Make fast math mode optional #3205

Uh oh!

Conversation

ivarflakstad commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jacobgorm commented Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ivarflakstad commented Nov 21, 2025

Uh oh!

ivarflakstad commented Nov 21, 2025

Uh oh!

jacobgorm commented Nov 21, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ivarflakstad commented Nov 20, 2025 •

edited

Loading

jacobgorm commented Nov 21, 2025 •

edited

Loading